precise localization
Road Damage and Manhole Detection using Deep Learning for Smart Cities: A Polygonal Annotation Approach
Hossen, Rasel, Mistry, Diptajoy, Rahman, Mushiur, Hridoy, Waki As Sami Atikur Rahman, Saha, Sajib, Ibrahim, Muhammad
Urban safety and infrastructure maintenance are critical components of smart city development. Manual monitoring of road damage is time-consuming, costly, and error-prone. This paper presents a deep learning approach for automated road damage and manhole detection using the YOLOv9 algorithm with polygonal annotations. Unlike traditional bounding box annotation, we employ polygonal annotations for more precise localization of road defects. We develop a novel dataset comprising more than one thousand images, most of which were collected in Dhaka, Bangladesh. This dataset is used to train a YOLO-based model for three classes, namely Broken, Not Broken, and Manhole. We achieve 78.1% overall image-level accuracy. The YOLOv9 model demonstrates strong performance for the Broken (86.7% F1-score) and Not Broken (89.2% F1-score) classes, but struggles with Manhole detection (18.2% F1-score) due to class imbalance. Our approach offers an efficient and scalable solution for monitoring urban infrastructure in developing countries.
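To illustrate why a polygonal annotation can localize a defect more tightly than a bounding box, the following minimal sketch compares the area of a polygon with the area of its axis-aligned bounding box. The crack outline and the function names are hypothetical and chosen for illustration only; they are not taken from the paper's dataset or code.

```python
def polygon_area(points):
    """Area of a simple polygon [(x, y), ...] via the shoelace formula."""
    n = len(points)
    twice_area = 0.0
    for i in range(n):
        x1, y1 = points[i]
        x2, y2 = points[(i + 1) % n]  # wrap around to close the polygon
        twice_area += x1 * y2 - x2 * y1
    return abs(twice_area) / 2.0


def bounding_box_area(points):
    """Area of the axis-aligned box that encloses all polygon vertices."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return (max(xs) - min(xs)) * (max(ys) - min(ys))


# Hypothetical outline of a thin diagonal crack, in pixel coordinates.
crack = [(0, 0), (10, 0), (100, 90), (100, 100), (90, 100), (0, 10)]

# Fraction of the bounding box that the annotated defect actually covers.
fill_ratio = polygon_area(crack) / bounding_box_area(crack)
print(f"polygon covers {fill_ratio:.0%} of its bounding box")
```

For this made-up diagonal crack, most of the bounding box is undamaged road surface, which is the kind of slack a polygonal label avoids.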
No Need to Look! Locating and Grasping Objects by a Robot Arm Covered with Sensitive Skin
Bartunek, Karel, Rustler, Lukas, Hoffmann, Matej
This work has been submitted to the IEEE for possible publication. Locating and grasping of objects by robots is typically performed using visual sensors. Haptic feedback from contacts with the environment is only secondary, if present at all. The main novelty lies in the use of contacts over the complete surface of a robot manipulator covered with sensitive skin. The search is divided into two phases: (1) coarse workspace exploration with the complete robot surface, followed by (2) precise localization using the end-effector equipped with a force/torque sensor. We systematically evaluated this method in simulation and on the real robot, demonstrating that diverse objects can be located, grasped, and put in a basket. The overall success rate on the real robot for one object was 85.7%, with failures occurring mainly while grasping specific objects. The method using whole-body contacts is six times faster than a baseline that uses haptic feedback only on the end-effector. We also demonstrate locating and grasping multiple objects on the table. This method is not restricted to our specific setup and can be deployed on any platform capable of sensing contacts over its entire body surface. This work holds promise for diverse applications in areas with challenging visual perception (due to lighting, dust, smoke, occlusion), such as in agriculture, where fruits or vegetables need to be located inside foliage and picked. Perception for robot manipulation has been dominated by visual inputs from cameras (RGB) or depth cameras (RGB-D). Classical methods have been used for object segmentation and pose and shape estimation to feed the synthesis of grasp proposals for a robot hand (e.g., [1]).
Augmented Reality without Borders: Achieving Precise Localization Without Maps
Puigjaner, Albert Gassol, Aloise, Irvin, Schmuck, Patrik
Visual localization is crucial for Computer Vision and Augmented Reality (AR) applications, where determining the position and orientation of the camera or device is essential to accurately interact with the physical environment. Traditional methods rely on detailed 3D maps constructed using Structure from Motion (SfM) or Simultaneous Localization and Mapping (SLAM), which are computationally expensive to build and impractical for dynamic or large-scale environments. We introduce MARLoc, a novel localization framework for AR applications that uses known relative transformations within image sequences to perform intra-sequence triangulation, generating 3D-2D correspondences for pose estimation and refinement. MARLoc eliminates the need for pre-built SfM maps, providing accurate and efficient localization suitable for dynamic outdoor environments. Evaluation with benchmark datasets and real-world experiments demonstrates MARLoc's state-of-the-art performance and robustness. By integrating MARLoc into an AR device, we highlight its capability to achieve precise localization in real-world outdoor scenarios, showcasing its practical effectiveness and potential to enhance visual localization in AR applications.
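The idea of triangulating within a sequence from known relative transformations can be sketched with a generic two-view midpoint triangulation. This is not MARLoc's implementation, which the abstract does not detail; the function names, the identity rotation, the 1 m baseline, and the landmark are all assumptions made for illustration. Each frame contributes a ray through the observed pixel (in normalized image coordinates), and the 3D point is taken as the midpoint of the shortest segment between the two rays.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def bearing(rotation, uv):
    """Rotate the normalized-image ray (u, v, 1) of a frame into frame 1."""
    ray = (uv[0], uv[1], 1.0)
    return [dot(row, ray) for row in rotation]


def triangulate_midpoint(c1, d1, c2, d2):
    """Midpoint of the shortest segment between rays c1 + s*d1 and c2 + t*d2."""
    w = [p - q for p, q in zip(c1, c2)]
    a, b, c = dot(d1, d1), dot(d1, d2), dot(d2, d2)
    d, e = dot(d1, w), dot(d2, w)
    denom = a * c - b * b
    if abs(denom) < 1e-12:
        raise ValueError("rays are (nearly) parallel; triangulation is unstable")
    s = (b * e - c * d) / denom  # distance parameter along ray 1
    t = (a * e - b * d) / denom  # distance parameter along ray 2
    p1 = [ci + s * di for ci, di in zip(c1, d1)]
    p2 = [ci + t * di for ci, di in zip(c2, d2)]
    return [(u + v) / 2.0 for u, v in zip(p1, p2)]


# Assumed relative transformation: frame 2 is shifted 1 m along x, no rotation.
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
center_1, center_2 = [0.0, 0.0, 0.0], [1.0, 0.0, 0.0]

# Hypothetical observations of one landmark in normalized image coordinates.
uv_1, uv_2 = (0.125, 0.05), (-0.125, 0.05)

point_3d = triangulate_midpoint(
    center_1, bearing(identity, uv_1), center_2, bearing(identity, uv_2)
)
print(point_3d)
```

Pairing such a triangulated point with its 2D detection in a query image gives one 3D-2D correspondence; a set of these could then be passed to a standard perspective-n-point solver for pose estimation.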
Precise localization within the GI tract by combining classification of CNNs and time-series analysis of HMMs
Werner, Julia, Gerum, Christoph, Reiber, Moritz, Nick, Jörg, Bringmann, Oliver
This paper presents a method to efficiently classify the gastroenterologic section of images derived from Video Capsule Endoscopy (VCE) studies by exploring the combination of a Convolutional Neural Network (CNN) for classification with the time-series analysis properties of a Hidden Markov Model (HMM). It is demonstrated that successive time-series analysis identifies and corrects errors in the CNN output. Our approach achieves an accuracy of $98.04\%$ on the Rhode Island (RI) Gastroenterology dataset. This allows for precise localization within the gastrointestinal (GI) tract while requiring only approximately 1M parameters, and thus provides a method suitable for low-power devices.
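How an HMM can correct isolated CNN errors can be sketched with Viterbi decoding over per-frame class probabilities. This is a minimal illustration, not the authors' implementation: the three states, the left-to-right transition matrix, and the probabilities below are made up, and using the CNN's softmax outputs directly as emission scores is a simplifying assumption.

```python
import math


def viterbi(emission_probs, transition, initial):
    """Most likely state sequence given per-frame class probabilities."""
    n_states = len(initial)

    def log(p):
        return math.log(max(p, 1e-12))  # floor avoids log(0)

    score = [log(initial[s]) + log(emission_probs[0][s]) for s in range(n_states)]
    backpointers = []
    for frame in emission_probs[1:]:
        best_prev_per_state, new_score = [], []
        for s in range(n_states):
            # Best predecessor of state s under the transition model.
            r = max(range(n_states), key=lambda q: score[q] + log(transition[q][s]))
            best_prev_per_state.append(r)
            new_score.append(score[r] + log(transition[r][s]) + log(frame[s]))
        backpointers.append(best_prev_per_state)
        score = new_score
    # Trace back from the best final state.
    state = max(range(n_states), key=lambda s: score[s])
    path = [state]
    for best_prev_per_state in reversed(backpointers):
        state = best_prev_per_state[state]
        path.append(state)
    return path[::-1]


# Hypothetical CNN outputs for six frames over three consecutive GI sections.
cnn_probs = [
    [0.9, 0.05, 0.05],
    [0.8, 0.1, 0.1],
    [0.2, 0.1, 0.7],  # isolated misclassification
    [0.7, 0.2, 0.1],
    [0.2, 0.7, 0.1],
    [0.1, 0.8, 0.1],
]
# Assumed left-to-right model: the capsule never moves back to an earlier section.
transition = [[0.9, 0.1, 0.0], [0.0, 0.9, 0.1], [0.0, 0.0, 1.0]]
initial = [1.0, 0.0, 0.0]

raw = [max(range(3), key=lambda s: frame[s]) for frame in cnn_probs]
smoothed = viterbi(cnn_probs, transition, initial)
print(raw)       # [0, 0, 2, 0, 1, 1]
print(smoothed)  # [0, 0, 0, 0, 1, 1]
```

Because the assumed transition model forbids jumping from the first section to the third and back, the decoder overrides the single outlier frame while keeping the genuine transition near the end of the sequence.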